Abstract. We define the class of simple recursive games. A simple recursive
game is defined as a simple stochastic game (a notion due to
Anne Condon), except that we allow arbitrary real payoffs but disallow 
moves of chance. We study the complexity of solving simple recursive 
games and obtain an almost-linear time comparison-based algorithm for 
computing an equilibrium of such a game. The existence of a linear time 
comparison-based algorithm remains an open problem. 



1 Introduction 

Understanding rational behavior in infinite duration games has been an impor- 
tant theme in pure as well as computational game theory for several decades. A 
number of central problems remain unsolved. In pure game theory, the existence 
of near-equilibria in general-sum two-player stochastic games was established
in a celebrated result by Vieille [18,19], but the existence of near-equilibria for
the three-player case remains an important and elusive open problem [5]. In 
computational game theory, Condon [1] delineated the efficient computation of 
positional equilibria in simple stochastic games as an important task. While 
Condon showed this task to be doable in NP ∩ coNP, to this day, the best
deterministic algorithms are not known to be of subexponential complexity. To 
the computer science community, the problem of computing positional equilib- 
ria in simple stochastic games is motivated by its hardness for finding equilibria 
in many other natural classes of games [20], which again implies hardness for
tasks such as model checking the μ-calculus [5], which is relevant for the formal
verification of computerized systems. 



1.1 Simple Stochastic Games 

A simple stochastic game [1] is given by a graph G = (V, E). The vertices in V
are the positions of the game. Each vertex belongs either to player Max, to player
Min, or to Chance. There is a distinguished starting position v_0. Furthermore,
there are a number of distinguished terminal positions or just terminals, each 
labeled with a payoff from Min to Max.^ All positions except the terminal ones

^ In Condon's original paper, there were only two terminals, with the payoffs 0 and 1.
The relaxation to arbitrary payoffs that we adopt here is fairly standard.



have outgoing arcs. The game is played by initially placing a token on v_0, letting
the token move along a uniformly randomly chosen outgoing arc when it is in a 
position belonging to Chance and letting each of the players decide along which 
outgoing arc to move the token when it is in a position belonging to him. If a 
terminal is reached, then Min pays Max its payoff and the game ends. Infinite 
play yields payoff 0. A positional strategy for a player is a selection of one outgoing 
arc for each of his positions. He plays according to the strategy if he moves along 
these arcs whenever he is to move. It is known (see [1]) that each position p in
a simple stochastic game can be assigned a value Val(p) so that: 

1. Max has a positional strategy that, regardless of what strategy Min adopts, 
ensures an expected payoff of at least Val(p) if the game starts in p. 

2. Min has a positional strategy that, regardless of what strategy Max adopts, 
ensures that the expected payoff is at most Val(p) if the game starts in p. 

The value of the game itself is the value of v_0. Condon considered the complexity
of computing this value. It is still open if this can be done in polynomial time. 
In the present paper, we shall look at some easier problems. For those, we want 
to make some distinctions which are inconsequential when considering whether 
the problems are polynomial time solvable or not, but important for the more 
precise (almost linear) time bounds that interest us in this paper.

— A weak solution is Val(v_0) and a positional strategy for each player satisfying
the conditions in items 1 and 2 above for p = v_0.

— A strong solution is the list of values of all positions in the game and a po- 
sitional strategy for each player that for all positions p, ensures an expected 
payoff of at least/most Val(p) if the game starts in p. 

In game theory terminology, weak solutions to a game are Nash equilibria of the 
game while strong solutions are subgame perfect equilibria. Figure 1 illustrates
that the distinction is not inconsequential. In the weak solution to the left, Max




Fig. 1. The left solution is weak, the right is strong. 



(if he gets to move) is content to achieve payoff 0, the value of the game, even 
though he could achieve payoff 1. Note that the game in Figure 1 is acyclic. In



contrast to the general case, it is of course well known that a strong solution 
to an acyclic game can be found in linear time by straightforward dynamic 
programming (known as backwards induction in the game theory community). 
We shall say that we weakly (resp. strongly) solve a given game when we provide 
a weak (resp. strong) solution to the game. Note that when talking about strong 
solutions, the starting position is irrelevant and does not have to be specified. 
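
The backwards induction mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the dictionary-based game representation, the name solve_acyclic, and the small example game (written in the spirit of Figure 1, whose exact layout is assumed here) are all hypothetical.

```python
def solve_acyclic(owner, succ, payoff):
    """Strongly solve an acyclic game without chance positions.

    owner:  dict, non-terminal -> "max" or "min" (assumed representation)
    succ:   dict, non-terminal -> list of successor vertices
    payoff: dict, terminal -> payoff from Min to Max
    Returns (values, strategy): a value for every vertex and one chosen
    successor for every non-terminal.
    """
    values, strategy = dict(payoff), {}

    def value(v):
        # Memoized recursion: each vertex is evaluated once and each arc
        # is inspected once, so the total work is linear in the game size.
        if v not in values:
            pick = max if owner[v] == "max" else min
            best = pick(succ[v], key=value)
            strategy[v] = best
            values[v] = values[best]
        return values[v]

    for v in owner:
        value(v)
    return values, strategy


# Hypothetical example: Min moves first and can end the game with payoff 0
# or hand the move to Max, who can choose between payoffs 0 and 1.
owner = {"a": "min", "b": "max"}
succ = {"a": ["t0", "b"], "b": ["u0", "u1"]}
payoff = {"t0": 0, "u0": 0, "u1": 1}
values, strategy = solve_acyclic(owner, succ, payoff)
```

In this example the value of the starting position "a" is 0, while the strong solution still makes Max pick the terminal with payoff 1 at "b", a position that is never reached under optimal play.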

1.2 Simple Recursive Games 

Condon established that for the case of a simple stochastic game with no Chance 
positions and only 0/1 payoffs, the game can be strongly solved in linear time. In- 
terestingly, Condon's algorithm has been discovered and described independently 
by the artificial intelligence community where it is known under the name of
retrograde analysis [17]. It is used routinely in practice for finding optimal strategies
for combinatorial games that are small enough for the game graph to be repre- 
sented in (internal or external) memory and where dealing with the possibility of 
cycling is a non-trivial aspect of the optimal strategies. The best known example 
is the construction of tables for chess endgames [9]. Condon's algorithm (and ret- 
rograde analysis) being linear time depends crucially on the fact that the games 
considered are win/lose games (or, as is usually the case in the AI literature, 
win/lose/draw games), i.e., that terminal payoffs are either 0 or 1 (or possibly
also -1, or in some AI examples even a small range of integers, e.g., [16]). In this
paper we consider the algorithmic problem arising when arbitrary real payoffs 
are allowed. That is, we consider a class of games similar to but incomparable 
to Condon's simple stochastic games: We disallow chance vertices, but allow 
arbitrary real payoffs. We call the resulting class simple recursive games.^

Some simple examples of simple recursive games are given in Figure 2. In (a),
the unique strong solution is for Min to choose right and for Max to choose left.
Thus, the outcome is infinite play. In (b), the unique strong solution is for Min to




(a) (b) 

Fig. 2. (a) Infinite play equilibrium, (b) All values are 1, but one choice is 
suboptimal. 

choose right and for Max to choose right. The values of both vertices are 1, but 
we observe that it is not a sufficient criterion for correct play to choose a vertex 
with at least as good a value as your current vertex. In particular, according to 



^ The perfect information, no chance special case of Everett's "recursive games" [7]. 



this criterion, Max could choose left, but this would lead to infinite play and a 
payoff of 0, which is a suboptimal outcome for Max. 

We observe below that if a sorted list of the payoffs (with pointers to the 
corresponding terminals of the game) is given in advance, optimal strategies can 
again be found in linear time without further processing of the payoffs. From 
this it follows that a simple recursive game with n payoffs and m arcs in the 
graph can be solved in time O(n log n + m) by a comparison-based algorithm.
The main question we attempt to approach in this paper is the following: 

Main Open Problem. Can a simple recursive game be (weakly or strongly) 
solved in linear time by a comparison-based algorithm? 

We believe this to be an interesting question, both in the context of game solving 
(simple recursive games being a very simple yet non-trivial natural variant of the 
general problem) and in the context of the study of comparison-based algorithms 
and comparison complexity. This paper provides neither a positive nor a negative 
answer to the question, but we obtain a number of partial results, described in 
the next subsection. 

1.3 Our Results 

Throughout this section we consider simple recursive games with n denoting 
the number of terminals (i.e., number of payoffs) and m denoting the total size 
(i.e., number of arcs) of the graph defining the game. We can assume m ≥ n, as
terminals without incoming arcs are irrelevant. 

Strategy Recovery in Linear Time. The example of Figure 2(b) shows
that it is not completely trivial to obtain a strong solution from a list of values
of the vertices. We show that this task can be done in linear time, i.e., time
O(m). Thus, when constructing algorithms for obtaining a strong solution, one
can concentrate on the task of computing the values Val(p) for all p. Similarly, 
we show that given the value of just the starting position, a weak solution to the 
game can be computed in linear time. 

The Number of Comparisons. When considering comparison-based algo- 
rithms, it is natural to study the number of comparisons used separately from 
the running time of the algorithm (assuming a standard random access machine). 
By an easy reduction from sorting, we show that there is no comparison-based 
algorithm that strongly solves a given game using only O(n) comparisons. In
fact, Ω(n log n) comparisons are necessary. In contrast, Mike Paterson (personal
communication) has observed that a simple recursive game can be weakly solved
using O(n) comparisons and O(m log n) time. With his kind permission, his
algorithm is included in this paper. This also means that for the case of weak
solutions, our main open problem cannot be solved in the negative using
current lower-bound techniques, as it is not the number of comparisons that is the
bottleneck. Our lower bound uses a game with m = Θ(n log n) arcs. Thus, the



following interesting open question concerning only the comparison complexity 
remains: Can a simple recursive game be strongly solved using O(m)
comparisons? If resolved in the negative, it will resolve our main open problem for the
case of strong solutions. 

Almost-Linear Time Algorithm for Weak Solutions. As stated above, 
Mike Paterson has observed that a simple recursive game can be weakly solved 
using O(n) comparisons and O(m log n) time. We refine his algorithm and
obtain an algorithm that weakly solves a game using O(n) comparisons and only
O(m log log n) time. Also, we obtain an algorithm that weakly solves a game in
time O(m + m(log* m - log*(m/n))) but uses a superlinear number of comparisons.
For the case of strongly solving a game, we have no better bounds than those
derived from the simple algorithm described in Section 2, i.e., O(m + n log n) time
and O(n log n) comparisons. Note that the bound O(m + m(log* m - log*(m/n)))
is linear in m whenever m ≥ n log log ... log n for a constant number of 'log's.
Hence it is at least as good a bound as O(m + n log n), for any setting of the
parameters m, n.

2 Preliminaries 

Definition 1. A simple recursive game (SRG) is a digraph with vertices
partitioned into sets of non-terminals V_Min and V_Max, which are game positions where
player Min and Max, respectively, chooses the next move (arc), and terminals T,
where the game ends and Min pays Max the amount specified by p : T → R. □

For simplicity, we will assume that terminals have distinct payoffs, i.e., that 
p is injective. We can easily simulate this by artificially distinguishing terminals 
with equal payoffs in some arbitrary (but consistent) fashion. We will also assume 
that m ≥ n, since terminals without incoming arcs are irrelevant.

Definition 2. We denote by Val_G(v) the value of the game G when the vertex
v is used as the initial position and infinite play is interpreted as a zero payoff.
This will also be called "the value of v (in G)". □

Remark 1. That such a value indeed exists will follow from Proposition 1. We
shall later see how to construct optimal strategies from vertex values. 

Definition 3. To merge a non-terminal v with a terminal t is to remove all 
outgoing arcs of v, reconnect all its incoming arcs to t, and then remove v. □ 

The definitions of a strong and a weak solution are as stated in the introduction. 
The following algorithm is a generalization of Condon's linear time algorithm 
[1] for solving simple recursive games with payoffs in {0, 1}. That algorithm is
known as retrograde analysis in the AI community [17], and we shall adopt this
name also for this more general version. 



Proposition 1. Given an SRG and a permutation that orders its terminals, 
we can find a strong solution to the game in linear time and using no further 
comparisons of payoffs. 

Proof. If all payoffs are 0, then all values are 0 and every strategy is optimal.

Suppose that the minimum payoff p(t) is negative. Any incoming arc to t
from a Max-vertex that is not the only outgoing arc from that vertex is clearly 
suboptimal and can be discarded. Each other incoming arc is an optimal choice 
for its source vertex, which can therefore be merged with t. Symmetric reasoning 
applies when the maximum payoff is positive. □ 
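
The argument in the proof can be turned into a short procedure. The following is a sketch under stated assumptions, not the paper's own code: the dictionary representation, the name retrograde, and the parameter order (the terminals listed by increasing payoff, playing the role of the given permutation) are hypothetical. The sketch only inspects the sign of each payoff and never compares two payoffs with each other.

```python
from collections import deque


def retrograde(owner, succ, payoff, order):
    """Strongly solve an SRG, given its terminals sorted by payoff.

    owner:  dict, non-terminal -> "max" or "min"
    succ:   dict, non-terminal -> non-empty list of successors
    payoff: dict, terminal -> payoff (assumed distinct)
    order:  list of terminals by increasing payoff
    Returns (value, choice).
    """
    pred = {v: [] for v in list(owner) + list(payoff)}
    for u, outs in succ.items():
        for w in outs:
            pred[w].append(u)
    remaining = {u: len(succ[u]) for u in owner}  # arcs not yet discarded
    value, choice = dict(payoff), {}
    negative = [t for t in order if payoff[t] < 0]            # most negative first
    positive = [t for t in reversed(order) if payoff[t] > 0]  # most positive first
    for t in negative + positive:
        eager = "min" if payoff[t] < 0 else "max"  # player who wants to reach t
        queue = deque([t])
        while queue:
            w = queue.popleft()
            for u in pred[w]:
                if u in value:
                    continue
                remaining[u] -= 1  # this arc is either taken or discarded
                if owner[u] == eager or remaining[u] == 0:
                    value[u], choice[u] = payoff[t], w  # merge u with t
                    queue.append(u)
    # Every vertex left over has value 0 and may move to any value-0 vertex.
    zero = [u for u in owner if u not in value]
    for u in zero:
        value[u] = 0
    for u in zero:
        choice[u] = next(w for w in succ[u] if value[w] == 0)
    return value, choice


# Hypothetical two-vertex game: Min at "a" may stop at payoff 2 or move to
# "b"; Max at "b" may stop at payoff 1 or move back to "a".
value, choice = retrograde(
    {"a": "min", "b": "max"},
    {"a": ["t2", "b"], "b": ["a", "t1"]},
    {"t1": 1, "t2": 2},
    ["t1", "t2"],
)
```

In the example both non-terminals receive value 1, and Max is assigned the arc to the terminal rather than the arc back to "a", which would lead to infinite play. Each arc is handled a constant number of times, which is what gives the linear running time.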

This immediately yields the sorting method for strongly solving SRGs: First sort 
the payoffs, and then apply Proposition 1.

Corollary 1. An SRG with m arcs and n terminals can be strongly solved in 
O(m + n log n) time. □

Definition 4. To merge a terminal s with another terminal t is to reconnect all 
incoming arcs of s to t and then remove s. Two terminals are adjacent if their
payoffs have the same sign and no other terminal has a payoff in between. □ 

The following lemma states the intuitive fact that when we merge two adja- 
cent terminals, the only non-terminals affected are those with the corresponding 
values, and they acquire the same (merged) value. 

Lemma 1. If G' is obtained from the SRG G by merging a terminal s with an 
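adjacent terminal t, then for each non-terminal v, we have

\[ \mathrm{Val}_{G'}(v) = \begin{cases} p(t) & \text{if } \mathrm{Val}_G(v) = p(s), \\ \mathrm{Val}_G(v) & \text{otherwise.} \end{cases} \tag{1} \]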

Proof. Consider all SRGs with a fixed structure (i.e., underlying graph) but with 
varying payoffs. Since a strong solution can be computed by a comparison-based 
algorithm (Proposition 1), the value of any particular position v can be described
by a min/max formula over the payoffs. The claim of the lemma can be seen to
be true by a simple induction on the size of the relevant formula. □

By repeatedly merging adjacent terminals, we "coarsen" the game. Figure 3
shows an example of this. The partitioning method we shall use to construct 
coarse games in this paper also yields sorted lists of their payoffs. Hence, we 
shall be able to apply retrograde analysis to solve them in linear time. 

Corollary 2. The signs of the values of all vertices in a given SRG can be 
determined in linear time. 

Proof. Merge all terminals with negative payoffs into one, do likewise for those 
with positive payoffs, and then solve the resulting coarse "win/lose/draw" game 
by retrograde analysis. □ 







Fig. 3. Coarsening by merging {-4, -1} and {2, 3, 5}.



Clearly, arcs between vertices with different values cannot be part of a strong 
solution to a game. From this, the following lemma is immediate. 

Lemma 2. In an SRG, removing an arc between two vertices with different 
values does not affect the value of any vertex. □ 

Remark 2. Corollary 2, Lemma 2, and symmetry together allow us to restrict
our attention to games where all vertices have positive values, as will be done in 
subsequent sections. 

Proposition 2. Given the value of the initial position of an SRG, a weak so- 
lution can be found in linear time. If the values of all positions are known, a 
strong solution can be found in linear time. 

Proof. In the first case, let y be the value of the initial position v_0. We partition
payoffs into at most five intervals: (-∞, min(y, 0)), {min(y, 0)}, (min(y, 0), max(y, 0)),
{max(y, 0)}, and (max(y, 0), ∞). We merge all terminals in each of the intervals,
obtaining a game with at most five terminals. A strong solution for the resulting
coarse game is found in linear time by retrograde analysis. The pair of strategies
obtained is then a weak solution to the original game, by Lemma 1.

In the second case, by Lemma 2, we can first discard all arcs between vertices
of different values. This disintegrates the game into smaller games where all 
vertices have the same value. We find a strong solution to each of these games in 
linear time using retrograde analysis. Combining these solutions in the obvious 
way yields a strong solution to the original game, by Lemma 2. □
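
The five-way partition used in the first case of the proof can be illustrated as follows. This is a small sketch only: the names interval_index and coarsen and the dictionary representation of the terminals are assumptions made for illustration. Each terminal is classified with a constant number of comparisons against y and 0, so the coarsening takes linear time.

```python
def interval_index(payoff, y):
    """Return the index (0 to 4) of the interval containing `payoff`.

    The intervals are those from the proof, for y = Val(v_0):
    0: below min(y, 0), 1: equal to min(y, 0), 2: strictly between,
    3: equal to max(y, 0), 4: above max(y, 0).
    When y = 0, intervals 1 and 3 coincide and index 1 is reported.
    """
    lo, hi = min(y, 0), max(y, 0)
    if payoff < lo:
        return 0
    if payoff == lo:
        return 1
    if payoff < hi:
        return 2
    if payoff == hi:
        return 3
    return 4


def coarsen(payoffs, y):
    """Group terminals (dict terminal -> payoff) by interval index."""
    groups = {}
    for terminal, p in payoffs.items():
        groups.setdefault(interval_index(p, y), []).append(terminal)
    return groups


# Hypothetical payoffs, with the value of the starting position y = 3.
groups = coarsen({"a": -2, "b": 0, "c": 1, "d": 3, "e": 7}, 3)
```

Merging the terminals within each group then yields the coarse game with at most five terminals to which retrograde analysis is applied.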

3 Solving Simple Recursive Games 
3.1 Strongly 

For solving SRGs in the strong sense, we currently know no asymptotically faster 
method than completely sorting the payoffs. Also, the number of comparisons 
this method performs is, when we consider bounds only depending on the number 
of terminals n, optimal. Any sorting network [12] can be implemented by an 
acyclic SRG, by simulating each comparator by a Max-vertex and a Min-vertex.
Figure 4 shows an example of this. Thus, we have the following tight bound.



Proposition 3. Strongly solving an SRG with n terminals requires Θ(n log n)
comparisons in the worst case. □ 
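
The reduction from sorting can be made concrete with a small sketch. The encoding below is an assumption made for illustration (the vertex names, the function names, and the 5-comparator network on four inputs are not taken from the paper): each comparator on wires (i, j) becomes a Min-vertex and a Max-vertex, both with arcs to the two vertices currently carried by those wires.

```python
def network_to_game(payoffs, comparators):
    """Build an acyclic SRG from a comparator network (hypothetical encoding).

    payoffs:     list of terminal payoffs, one per input wire
    comparators: list of wire pairs (i, j) with i < j
    Returns (owner, succ, payoff, wire), where wire[k] is the vertex whose
    value is the k-th output of the network.
    """
    payoff = {f"t{i}": p for i, p in enumerate(payoffs)}
    owner, succ = {}, {}
    wire = list(payoff)  # vertex currently carried by each wire
    for k, (i, j) in enumerate(comparators):
        lo, hi = f"min{k}", f"max{k}"
        owner[lo], owner[hi] = "min", "max"
        succ[lo] = [wire[i], wire[j]]
        succ[hi] = [wire[i], wire[j]]
        wire[i], wire[j] = lo, hi  # the smaller value continues on wire i
    return owner, succ, payoff, wire


def sorted_outputs(payoffs, comparators):
    """Values of the output vertices, computed by backwards induction."""
    owner, succ, payoff, wire = network_to_game(payoffs, comparators)
    memo = dict(payoff)

    def value(v):
        if v not in memo:
            pick = max if owner[v] == "max" else min
            memo[v] = pick(value(w) for w in succ[v])
        return memo[v]

    return [value(v) for v in wire]


# A standard 5-comparator sorting network on four wires.
network = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
outputs = sorted_outputs([3, 1, 4, 2], network)
```

Here the values of the output vertices are the payoffs in sorted order, so a strong solution of the game reveals the sorted order of its terminals. Every comparator contributes two vertices and four arcs.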




Fig. 4. Implementing a sorting network by a simple recursive game. 



Implementing the asymptotically optimal AKS-network [P results in a game 
with Θ(n log n) vertices and arcs. Thus, it is still consistent with our current
knowledge that a game can be strongly solved using O(m) comparisons.

3.2 Weakly 

The algorithms we propose for weakly solving SRGs all combine coarsening of 
the set of payoffs with retrograde analysis. By splitting the work between these 
two operations in different ways, we get different time/comparison trade-offs. At 
one extreme is the sorting method. At the other, we partition the payoffs around 
their median (which can be done in linear time by Blum et al. [3]), use retrograde
analysis to solve the coarse game obtained by merging the terminals in each half, 
and then discard the irrelevant half of the terminals (the one not containing the 
value of the starting vertex) and all vertices with the corresponding values. This 
method, which is due to Mike Paterson, uses the optimal O(n) comparisons, but
requires O(log n) iterations, each with a worst case running time of O(m).

O(n) Comparisons and O(m log log n) Time. To improve the running time
of Paterson's algorithm, we stop and sort the remaining terminals as soon as this
can be done in O(n) time. The number of comparisons is still O(n). As noted in
Section 2, we may assume that all vertices have positive values.

Algorithm. Given an SRG G with m arcs, n terminals, and starting position v_0,
do the following for i = 0, 1, 2, ...

1. Partition the current set of n_i terminals around their median payoff.

2. Solve the coarse game obtained by merging the terminals in each half.

3. Remove all vertices that do not have values in the half containing Val_G(v_0).

4. Undo step 1 for the half of v_0.

When n_i log n_i < n, stop and solve the remaining game by the sorting method.



Analysis. Steps 1-4 can be performed in O(m) time and O(n_i) comparisons.
Since n_i is halved in each iteration, the comparisons sum to O(n), and the
final sort adds only O(n) more.
The number of iterations is O(log n - log f(n)), where f(n) is the inverse of
n ↦ n log n, and since this equals O(log log n) we have the following.

Theorem 1. An SRG with m arcs and n terminals can be weakly solved in
O(m log log n) time and O(n) comparisons. □
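
The iteration count in the analysis can be checked numerically. The sketch below is only an illustration of the counting argument, not part of the algorithm itself; the name halving_iterations is hypothetical and base-2 logarithms are assumed. It halves the number of remaining terminals until sorting them would cost fewer than n operations.

```python
import math


def halving_iterations(n):
    """Count median-halving rounds until n_i * log2(n_i) < n."""
    count, n_i = 0, n
    while n_i * math.log2(n_i) >= n:
        n_i = (n_i + 1) // 2  # keep the half containing the value of v_0
        count += 1
    return count


rounds_small = halving_iterations(2**10)
rounds_large = halving_iterations(2**30)
```

For n = 2^10 this gives 3 rounds and for n = 2^30 it gives 5 rounds, close to log2(log2 n), which is about 3.3 and 4.9 respectively.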



Almost-Linear Time. We can balance the partitioning and retrograde analysis 
to achieve an almost linear running time, by a technique similar to the one used 
in [8] and later generalized in [15].^ Again, we assume that all vertices have
positive values. 

Algorithm. Given an SRG G with m arcs, n terminals, and starting position v_0,
do the following for i = 0, 1, 2, ...

1. Partition the current set of n_i terminals into groups of size at most n_i / 2^{m/n_i}.

2. Solve the coarse game obtained by merging the terminals in each group.

3. Remove all vertices having values outside the group of Val_G(v_0).

4. Undo step 1 for the group of v_0.

When n_i / 2^{m/n_i} < 1, stop and solve the remaining game by the sorting method.

Analysis. All steps can be performed in O(m) time. For the first step we can do
a "partial perfect quicksort", where we always partition around the median and
stop at level ⌈m/n_i⌉ + 1.

To bound the number of iterations, we note that n_i satisfies the recurrence

\[ n_{i+1} \le n_i / 2^{m/n_i}, \tag{2} \]

which by induction gives

\[ n_i \le n \Big/ \underbrace{b^{b^{\cdot^{\cdot^{\cdot^{b}}}}}}_{i}, \]

where b = 2^{m/n}. Thus, the number of iterations is O(log*_b n), where log*_b denotes
the number of times we need to apply the base b logarithm function to get below
1. This is easily seen to be the same as O(1 + log* m - log*(m/n)). We have now
established the following.

Theorem 2. An SRG with m arcs and n terminals can be weakly solved in 
O(m + m(log* m - log*(m/n))) time. □
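
How quickly the number of terminals shrinks under recurrence (2) can be seen by iterating it directly. This is an illustrative sketch under assumptions: the name grouping_iterations is hypothetical, the recurrence is taken with equality (the worst case), and the stopping test n_i / 2^{m/n_i} < 1 is evaluated in the logarithmic domain to avoid overflow.

```python
import math


def grouping_iterations(m, n):
    """Count rounds of n_{i+1} = n_i / 2^(m / n_i) before the stop test holds.

    The loop runs while the group size n_i / 2^(m / n_i) is at least 1,
    i.e. while log2(n_i) - m / n_i >= 0.
    """
    count, n_i = 0, float(n)
    while math.log2(n_i) - m / n_i >= 0:
        n_i = n_i / 2 ** (m / n_i)
        count += 1
    return count


sparse = grouping_iterations(2**16, 2**16)  # m = n
medium = grouping_iterations(2**18, 2**16)  # m = 4n
dense = grouping_iterations(2**21, 2**16)   # m = 32n, more than n log2 n
```

With n = 2^16 terminals, the sparse game needs 3 rounds, the medium one needs 1, and the dense one needs none because the stop test holds immediately. The exponent m/n_i grows as n_i shrinks, which is why so few rounds occur.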

Remark 3. When m = Ω(n log^{(k)} n) for some constant k, where log^{(k)} denotes
the k times iterated logarithm, this bound is O(m).
^ Note, however, that while the technique is similar, the problem of solving simple
recursive games does not seem to fit into the framework of [15].